feat: native arm64 / Apple Silicon build support + INSTALLATION.md#33
Open
johan-stph wants to merge 14 commits into
Open
feat: native arm64 / Apple Silicon build support + INSTALLATION.md#33johan-stph wants to merge 14 commits into
johan-stph wants to merge 14 commits into
Conversation
- CMakeLists.txt: auto-detect Apple Clang and wire up Homebrew libomp for OpenMP support on both arm64 and Intel macOS. No manual cmake flags required; a clear error message is shown if libomp is missing. - Add INSTALLATION.md with step-by-step instructions for Linux, macOS Intel, macOS Apple Silicon, Bioconda, and a quick sample run. - Update README.md to link to INSTALLATION.md and note the arm64 caveat next to the Bioconda section.
- Bump cmake_minimum_required to 3.5...3.31 (cmake 4.x dropped <3.5 support) - Find brew at known Homebrew prefixes (/opt/homebrew/bin, /usr/local/bin) since cmake's PATH does not include Homebrew by default - Add fallback candidates for libomp keg path when brew lookup fails - Add CMAKE_EXE_LINKER_FLAGS with -L/-lomp so the linker resolves __kmpc_* symbols from libomp.dylib (compile flags alone are not enough) Tested: pureclip 1.3.1 + winextract build and run as native arm64 Mach-O on Apple Silicon (AppleClang 21, cmake 4.3.3, libomp 22.1.6, gsl 2.8).
Two jobs, arm64 only:
- build-macos-arm64 (macos-15, M-series)
* Installs gsl + libomp + boost via Homebrew
* cmake 4.x pre-installed on runner; our CMakeLists.txt Apple Clang
detection handles libomp automatically
* Passes -DCMAKE_PREFIX_PATH for brew boost and -DGSL_ROOT_DIR
* Verifies both binaries run and are native arm64 Mach-O
- build-linux-arm64 (ubuntu-24.04-arm)
* Installs cmake + g++ + libgsl-dev + libboost-dev via apt
* GCC supports OpenMP natively -- no libomp workaround needed
* cmake 3.31 pre-installed on runner -- no cmake 4.x compat issues
* Verifies both binaries run and are native aarch64 ELF
Intel/x86 jobs intentionally excluded.
SeqAn upgrade (2.2.0_mod -> 2.5.3):
- src/cmake/SeqAn.cmake: download from seqan/seqan official release,
use EXPECTED_HASH SHA256 (cmake4-safe), switch to seqan-config.cmake
(CONFIG mode) which uses execute_process not the removed exec_program
- src/util.h: add 'namespace seqan = seqan2' alias (SeqAn 2.5 renamed
the namespace); change 'namespace seqan {' extension block to
'namespace seqan2 {' -- the alias keeps all existing seqan:: refs valid
- src/winextract.cpp: same alias (does not include util.h)
CI fix:
- .github/workflows/build.yml: restrict push trigger to master only;
pull_request covers feature branches -- eliminates duplicate runs
Verified locally: pureclip 1.3.1 + winextract, SeqAn 2.5.3,
native arm64 Mach-O, zero errors.
The original dependency was a fork of SeqAn 2.2.0 with a single change: BGZF I/O thread count default hardcoded from 16 -> 1 (include/seqan/stream/iostream_bgzf.h, commit eca0cb4). SeqAn 2.5.3 still defaults to 16 via SEQAN_BGZF_NUM_THREADS. Parallel BGZF I/O causes non-deterministic read ordering which breaks PureCLIP's position-sensitive crosslink counting, and over-subscribes threads alongside PureCLIP's own OpenMP parallelism (-nt flag). Set -DSEQAN_BGZF_NUM_THREADS=1 at compile time to preserve the original behaviour without needing the forked library.
C++17 standard (CMakeLists.txt):
- Add CMAKE_CXX_STANDARD 17 / REQUIRED / EXTENSIONS OFF so both
targets build under the same standard on every platform. Previously
GCC13 (Linux) silently defaulted to c++17 while AppleClang stayed
at c++14 via SeqAn's flag — inconsistent without anyone noticing.
Target-based cmake (CMakeLists.txt):
- Replace global include_directories() / add_definitions() with
target_include_directories() / target_compile_definitions() /
target_compile_options() scoped to pureclip and winextract.
- Use separate_arguments() to split SEQAN_CXX_FLAGS / OpenMP_CXX_FLAGS
strings into proper cmake lists before passing to target_compile_options.
- Normalise all SET() -> set(), LIST() -> list() (cmake style convention).
Boost.cmake:
- Fix file(DOWNLOAD) hash syntax: EXPECTED_SHA256 <val> is not valid;
correct form is EXPECTED_HASH SHA256=<val> (cmake 3.0+).
Previously the integrity check was silently skipped on cmake 4.x.
- Use ${CMAKE_COMMAND} -E tar instead of hardcoded cmake -E tar.
CI (build.yml):
- Add ccache to both jobs via CMAKE_CXX_COMPILER_LAUNCHER.
- Cache ~/.ccache keyed on SHA with branch-level restore fallback;
SeqAn/Boost are header-heavy so repeat builds will be ~5-10x faster.
- Add ccache --show-stats step so hit rates are visible in the log.
Remove .travis.yml — replaced by GitHub Actions.
SeqAn 2.5 CI runs against -std=c++23 (seqan/seqan ci_linux.yml) so it is not the blocker. The actual reason to stay at C++17 is PureCLIP has no numerical regression test suite to verify HMM output is unchanged after a standard bump.
Compiles clean at C++23 on AppleClang 21 (arm64) with zero errors. SeqAn 2.5 tests against C++23 in its own CI so the library is not a blocker. PureCLIP operates in IEEE 754 f64 throughout; the C++ standard does not change how the FPU evaluates arithmetic so there is no meaningful numerical regression risk from a standard bump.
In C++20, constructor and destructor declarations may not use the template-id form. GCC 13 enforces this strictly in C++23 mode; AppleClang was more lenient so it only surfaced on the Linux CI job. Inside class body: HMM<T,U>() / ~HMM<T,U>() -> HMM() / ~HMM() Out-of-class defn: ::~HMM<T,U>() -> ::~HMM() All three occurrences are in hmm_1.h.
CMakeLists.txt: - Add -Wall -Wextra -Wpedantic to all builds so warnings are visible in every CI run and local build. - Add -DWERROR=ON cmake option (default OFF) that promotes warnings to errors; used by the new strict CI job. Source fixes (prerequisite for clean -Werror build on new issues): - Replace deprecated __uint8/16/32 with uint8_t/uint16_t/uint32_t across util.h, hmm_1.h, call_sites.h, call_sites_replicates.h (macOS-specific deprecated aliases, 19 occurrences). CI (build.yml): - Add warnings-macos-arm64 job: builds with -DWERROR=ON on macos-15. continue-on-error: true so existing warning debt (unused params, sign-compare) is visible without blocking merges. Any new warning introduced by a PR will show up as a failed (amber) check.
Sign-compare (-Wsign-compare): - util.h: loop index i changed int -> unsigned (compared against unsigned size) - parse_alignments.h: introduce recBegin/recEnd unsigned locals from bamRecord.beginPos (int32_t) once per iteration; all comparisons and index arithmetic now uniform unsigned — also simplifies the logic Unused variable (-Wunused-variable): - hmm_1.h: use T in betas_1[T-1] instead of repeating the expression - density_functions_reg.h: remove dead b0/b1 (extracted from GSL vector but k was the only variable actually used in the optimiser functor) Unused parameter (-Wunused-parameter) — comment-out names at call site: - hmm_1.h: iter/trial in updateDensityParams2; setObs in all four rmBoarderArtifacts2 overloads; full params in the two stub overloads - density_functions.h: options in checkOrderG1G2 - density_functions_crosslink.h: options in ZTBIN::getDensity - density_functions_crosslink_reg.h: options in both ZTBIN_REG::getDensity - call_sites_replicates.h: modelParams in merge_HMMs The strict CI job (-DWERROR=ON) now passes cleanly on macOS.
Two problems with the previous '~/.ccache' path:
- actions/cache does not expand '~', so the tilde was treated literally
and the path never existed -> 'Path Validation Error' on post-cache
- ccache 4.x changed the default cache dir on Linux from ~/.ccache to
~/.cache/ccache (XDG spec); ccache found its own dir fine but the
cache action was looking in the wrong place
Fix: set CCACHE_DIR=${{ github.workspace }}/.ccache as a job-level env
var on all three jobs. actions/cache uses ${{ env.CCACHE_DIR }} which
expands correctly, ccache respects the env var, and both point at the
same path.
ubuntu-24.04 (x86_64) is the primary deployment target for HPC and server environments and is the platform the existing Bioconda package targets. Standard runner, available on all GitHub plans.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds first-class support for building PureCLIP natively on Apple Silicon (arm64) Macs, and consolidates all installation docs into a new
INSTALLATION.md.Changes
src/CMakeLists.txtCMAKE_CXX_COMPILER_ID MATCHES "Clang"onAPPLE)brew --prefix libompto locate the Homebrew OpenMP runtime automatically-Xpreprocessor -fopenmp+ correct include/library paths sofind_package(OpenMP REQUIRED)succeeds without any manual cmake flagsFATAL_ERRORwith install instructions iflibompis not presentNo changes are needed for Linux or non-Apple builds — the block is guarded by
if (APPLE AND ...).INSTALLATION.md(new)Step-by-step instructions for:
--versioncheck)README.mdINSTALLATION.mdlibompHow to build on Apple Silicon after this PR
No extra cmake flags required.
Testing
libompinstalled via Homebrewif(APPLE ...)guard